





# Macro Placement by Wire-Mask-Guided Black-Box Optimization

Yunqi Shi, Ke Xue, Lei Song, and Chao Qian\*

School of Artificial Intelligence Nanjing University, China

Advances in Neural Information Processing Systems 36 (NeurIPS'23)

### Why we are the best?



#### The problem we solve



Macro Placement

A vital stage in chip design

#### The baselines we beat







TCAD, DAC best paper



Significant improvement

(e.g., 80% wirelength improvement over [Google, Nature'21])

#### The contributions to community

- Bring EAs back to the state-ofthe-art for macro placement
- Reaffirm the potential of EAs for chip design







# Macro placement





#### Minimization objective: Half Perimeter Wire Length

$$HPWL(s, H) = \sum_{e_j \in E} (w_j + h_j)$$

Sum of the half perimeter of nets' bounding boxes

Non-differentiable!



#### **Constraint:**

No-overlapping

#### **High-dimensional:**

Thousands of macros









Human experts spend weeks for macro placement, owing to the very large number of macros to be placed and their complex connections



Design efficient algorithms producing better placement than human experts







Classical EAs 1980s-2000s

- Packing-based solution representation
- Optimize by Evolutionary Algorithms (EAs)

 $\mathbf{m}$ : number of macros

- $O(m^2)$  complexity for mapping [Murata et al., TCAD'96]
- Final performance heavily relies on the initial placement
- Low-efficiency and low-scalability

"... it is very slow and difficult to parallelize, thereby failing to scale to the increasingly large and complex circuits of the 1990s and beyond." [Google, Nature'21]

#### Previous methods





Classical EAs 1980s-2000s



**Analytical Placers 2000s-Now** 



- Optimize by gradient descent
- Overlapping
- Get stuck in local optima



#### Previous methods





Classical EAs 1980s-2000s



Analytical Placers
2000s-Now



Reinforcement Learning Methods 2021-Now



- Markov Decision Process
- Long training time
- Poor exploration

#### Previous methods





Classical EAs 1980s-2000s





Analytical Placers 2000s-Now

#### [Google, Nature'21]



Reinforcement Learning Methods 2021-Now



We propose a genotype-phenotype mapping inspired by RL formulation

**Bring EAs back to the state-of-the-art!** 





**Genotype representation** 

Phenotype representation





Partition the chip canvas into discrete grids





The first macro to place





The input netlist indicates connection relationships





















The second macro to place





Calculate the wirelength increment after placing the macro at each grid





Refer to its original location





Choose the nearest best grid
Place the macro





Place macro C similarly

Obtain the valid phenotype placement result

A greedy mapping based on the increment of wirelength



Improve the efficiency









# **Evolutionary Algorithm**

#### (1+1)-EA

- Initialization: Randomly generate 100 genotype solutions and pick the best
- Mutation: Randomly exchange two macros' locations

Already performs well!



# Optimization based on proposed mapping





# Any black-box optimization algorithm

(1+1)-EA

**Bayesian optimization** 

Random search



# Comparison with previous SOTA methods

Table 1: Wirelength ( $\times 10^5$ ) obtained by ten compared methods on seven popular ISPD contest chips.

| Classical EA | Method          | Type       | adaptec1         | adaptec2           | adaptec3           | adaptec4           | bigblue1         | bigblue3           | bigblue4 (×10 <sup>7</sup> ) | +/-/≈ | Avg. Rank |
|--------------|-----------------|------------|------------------|--------------------|--------------------|--------------------|------------------|--------------------|------------------------------|-------|-----------|
| method       | SP-SA [33]      | Packing    | $18.84 \pm 4.62$ | $117.36 \pm 8.73$  | $115.48 \pm 7.56$  | $120.03 \pm 4.25$  | $5.12 \pm 1.43$  | 164.70 ± 19.55     | $25.49 \pm 2.73$             | 0/7/0 | 6.86      |
|              | NTUPlace3 [12]  | Analytical | 26.62            | 321.17             | 328.44             | 462.93             | 22.85            | 455.53             | 48.38                        | 0/7/0 | 9.00      |
| Analytical   | RePlace [13]    | Analytical | $16.19 \pm 2.10$ | $153.26 \pm 29.01$ | $111.21 \pm 11.69$ | $37.64 \pm 1.05$   | $2.45 \pm 0.06$  | $119.84 \pm 34.43$ | $11.80 \pm 0.73$             | 1/6/0 | 5.28      |
| ,, <b>,</b>  | DREAMPlace [28] | Analytical | $15.81 \pm 1.64$ | $140.79 \pm 26.73$ | $121.94 \pm 25.05$ | $37.41 \pm 0.87$   | $2.44 \pm 0.06$  | $107.19 \pm 29.91$ | $12.29 \pm 1.64$             | 1/6/0 | 4.86      |
|              | Graph [32]      | RL         | $30.10 \pm 2.98$ | $351.71 \pm 38.20$ | $358.18 \pm 13.95$ | $151.42 \pm 9.72$  | $10.58 \pm 1.29$ | $357.48 \pm 47.83$ | $53.35 \pm 4.06$             | 0/7/0 | 9.00      |
| RL           | DeepPR [15]     | RL         | $19.91 \pm 2.13$ | $203.51 \pm 6.27$  | $347.16 \pm 4.32$  | $311.86 \pm 56.74$ | $23.33 \pm 3.65$ | $430.48 \pm 12.18$ | $68.30 \pm 4.44$             | 0/7/0 | 8.86      |
|              | MaskPlace [26]  | RL         | $6.38 \pm 0.35$  | $73.75 \pm 6.35$   | $84.44 \pm 3.60$   | $79.21 \pm 0.65$   | $2.39 \pm 0.05$  | $91.11 \pm 7.83$   | $11.07 \pm 0.90$             | 0/7/0 | 4.28      |
| Our          | WireMask-RS     | Ours       | $6.13 \pm 0.05$  | $59.28 \pm 1.48$   | $60.60 \pm 0.45$   | $62.06 \pm 0.22$   | $2.19 \pm 0.01$  | $62.58 \pm 2.07$   | $8.20 \pm 0.17$              | 0/5/2 | 2.57      |
|              | - WireMask-BO   | Ours       | $6.07 \pm 0.14$  | $59.17 \pm 3.94$   | $61.00 \pm 2.08$   | $63.86 \pm 1.01$   | $2.14 \pm 0.03$  | $67.48 \pm 6.49$   | $8.62 \pm 0.18$              | 0/3/4 | 2.86      |
| methods      | WireMask-EA     | Ours       | $5.91 \pm 0.07$  | $52.63 \pm 2.23$   | $57.75 \pm 1.16$   | $58.79 \pm 1.02$   | $2.12 \pm 0.01$  | $59.87 \pm 3.40$   | $8.28 \pm 0.25$              |       | 1.43      |

[Google, Nature'21]

- Some important
- Graph [Nature'21]: RL method proposed by Google
- tant DREAMPlace [DAC'19, TCAD'21 Best Paper] : One of the most popular analytical methods
- baselines DeepPR [NeurIPS'21] and MaskPlace [NeurIPS'22]: Two recent advanced RL methods

WireMask-EA (our proposed framework equipped with EA) achieves the best average rank, and reduces wirelength by 80% compared to [Google, Nature'21]



### Comparison with the latest method ChiPFormer [Lai et al., ICML'23]

Table 2: Wirelength ( $\times 10^5$ ) Compared with ChiPFormer on ten popular ISPD and ICCAD contest chips.

| Benchmark | ChiPFormer (1)     | WireMask-EA (1)    | ChiPFormer (0.3k) | Wi | reMask-EA (0.3k) | ChiPFormer (2k)  | WireMask-EA (2k) |
|-----------|--------------------|--------------------|-------------------|----|------------------|------------------|------------------|
| adaptec1  | $8.87 \pm 0.98$    | $7.20 \pm 0.34$    | $7.02 \pm 0.11$   |    | $6.29 \pm 0.07$  | $6.62 \pm 0.05$  | $5.96 \pm 0.08$  |
| adaptec2  | $122.37 \pm 22.61$ | $111.04 \pm 20.09$ | $70.42 \pm 2.67$  |    | $61.25 \pm 4.10$ | $67.10 \pm 5.46$ | $53.88 \pm 2.53$ |
| adaptec3  | $107.11 \pm 8.84$  | $75.37 \pm 2.93$   | $78.32 \pm 2.03$  |    | $64.49 \pm 1.69$ | $76.70 \pm 1.15$ | $59.26 \pm 1.30$ |
| adaptec4  | $85.63 \pm 7.52$   | $75.63 \pm 1.30$   | $69.42 \pm 0.54$  |    | $64.52 \pm 1.81$ | $68.80 \pm 1.59$ | $59.52 \pm 1.71$ |
| bigblue1  | $3.11 \pm 0.03$    | $2.31 \pm 0.06$    | $2.96 \pm 0.04$   |    | $2.18 \pm 0.01$  | $2.95 \pm 0.04$  | $2.14 \pm 0.01$  |
| bigblue3  | $131.78 \pm 17.36$ | $99.20 \pm 24.69$  | $81.48 \pm 4.83$  |    | $64.51 \pm 4.15$ | $72.92 \pm 2.56$ | $56.65 \pm 2.81$ |
| ibm01     | $4.57 \pm 0.27$    | $3.76 \pm 0.36$    | $3.61 \pm 0.08$   |    | $2.92 \pm 0.07$  | $3.05 \pm 0.11$  | $2.39 \pm 0.07$  |
| ibm02     | $6.01 \pm 0.41$    | $5.13 \pm 0.16$    | $4.84 \pm 0.17$   |    | $3.86 \pm 0.03$  | $4.24 \pm 0.25$  | $3.56 \pm 0.05$  |
| ibm03     | $2.15 \pm 0.17$    | $3.10 \pm 0.12$    | $1.75 \pm 0.07$   |    | $2.20 \pm 0.11$  | $1.64 \pm 0.06$  | $1.69 \pm 0.11$  |
| ibm04     | $5.00 \pm 0.14$    | $3.60 \pm 0.17$    | $4.19 \pm 0.11$   |    | $2.93 \pm 0.11$  | $4.06 \pm 0.13$  | $2.62 \pm 0.04$  |

WireMask-EA outperforms ChiPFormer [Lai et al., ICML'23] on 9 out of 10 chips, using the same number of evaluations





| circuit | method     | Tim       | ing    | NVP <sup>1</sup> | Congestion |      |
|---------|------------|-----------|--------|------------------|------------|------|
|         |            | WNS/ps    | TNS/ns |                  | H/%        | V/%  |
| C1      | Human      | 204       | 57.1   | 2569             | 0.06       | 0.38 |
|         | MaskPlace  | 161       | 42.7   | 1964             | 0.07       | 0.07 |
|         | ChiPFormer | 142       | 19.4   | 1636             | 0.04       | 0.07 |
| C2      | Human      | 403       | 492.2  | 11360            | 0.63       | 2.05 |
|         | MaskPlace  | 242       | 259.1  | 9710             | 0.57       | 1.67 |
|         | ChiPFormer | 177       | 224.9  | 8110             | 0.53       | 1.27 |
| C3      | Human      | 102       | 91.9   | 5614             | 1.02       | 0.85 |
|         | MaskPlace  | 116       | 92.8   | 5559             | 1.05       | 0.87 |
|         | ChiPFormer | 108       | 91.2   | 5452             | 1.02       | 0.82 |
|         | Human      | 399       | 438.0  | 13925            | 0.97       | 0.34 |
| C4      | MaskPlace  | 389       | 324.2  | 12582            | 0.68       | 0.34 |
|         | ChiPFormer | 248       | 266.0  | 12398            | 0.62       | 0.34 |
| C5      | Human      | 89        | 10.8   | 2675             | 0.02       | 0.07 |
|         | MaskPlace  | 122       | 32.2   | 2975             | 0.02       | 0.22 |
|         | ChiPFormer | 80        | 4.9    | 1706             | 0.02       | 0.04 |
| C6      | Human      | 154       | 137.4  | 6833             | 0.70       | 0.22 |
|         | MaskPlace  | 81        | 49.6   | 7040             | 0.77       | 0.26 |
|         | ChiPFormer | <b>78</b> | 38.1   | 6412             | 0.63       | 0.22 |

Experiments on **private industry chip cases:** 

ChiPFormer **outperforms** human experts (who place the macros with the help of commercial tool *Candence Innovus*)

WireMask-EA > ChiPFormer > Human expert

[Lai et al., ICML'23]







Most popular analytical placer DAC'19 Best Paper TCAD'21 Best Paper

**DREAMPlace** 



#### WireMask-EA

Our proposed framework equipped with **EA** 

Red points in the figure represent congestion points that can't be routed

Our results have much fewer red points, showing better routability and performance

## Why we are the best?







Macro Placement

A vital stage in chip design

#### The baselines we beat







TCAD, DAC best paper



#### Significant improvement

(e.g., 80% wirelength improvement over [Google, Nature'21])

#### The contributions to community

- Bring EAs back to the state-ofthe-art for macro placement
- Reaffirm the potential of EAs for chip design





# Reinforcement Learning within Tree Search for Fast Macro Placement

[Geng et al., ICML'24]

Shi, Y., Xue, K., Song, L., and Qian, C. Macro placement by wire-mask-guided black-box optimization. In *Thirtyseventh Conference on Neural Information Processing Systems*, 2023. Wiremask for Reducing Search Space The concept of wiremask was introduced by Lai et al. (2022) as the visual inputs to the neural networks. Shi et al. (2023) then employed wiremask to devise a greedy policy to guide the BBO algorithms. Similar to them, we restrict actions to the grid areas with the minimal HPWL increment, which narrows down the search space, thereby significantly enhancing the training efficiency and the placement quality.

Inspired by our work, **EfficientPlace** chooses the action **from the feasible grids with minimum wirelength increment**, **originated from our genotype-phenotype mapping**, enhancing RL searching

## Why we are the best?



#### The problem we solve



Macro Placement

A vital stage in chip design

#### The baselines we beat







TCAD, DAC best paper



#### Significant improvement

(e.g., 80% wirelength improvement over [Google, Nature'21])

#### The contributions to community

- Bring EAs back to the state-ofthe-art for macro placement
- Reaffirm the potential of EAs for chip design

#### We are working on

 Benchmark chip design cases as real-world problems for EAs

# Thanks for listening!